Subgraph Rank: PageRank for Subgraph-Centric Distributed Graph Processing
Abstract
The growth of Big Data has seen the increasing prevalence of interconnected graph datasets that reflect the variety and complexity of emerging data sources. Recent distributed graph processing platforms offer vertex-centric and subgraph-centric abstractions to compose and execute graph analytics on commodity clusters and Clouds. Naïve translation of existing graph algorithms to these programming models can offer sub-optimal performance. We analyze the effectiveness of PageRank, a popular graph centrality measure, for a subgraph-centric programming model, and propose variations based on the existing BlockRank algorithm to improve the performance. We evaluate these algorithms on real-world graphs using the GoFFish platform on Amazon EC2 Cloud VMs, and demonstrate that the proposed Subgraph Rank algorithm outperforms the native PageRank and BlockRank algorithms, and is faster by 23–74% for most graphs we evaluated, while achieving an equivalent PageRank quality.
Related Resources
SubgraphRank: PageRank Approximation for a Subgraph or in a Decentralized System
PageRank, a ranking metric for hypertext web pages, has received increasing interest. As the Web has grown in size, computing PageRank scores on the whole web using centralized approaches faces challenges in scalability. Distributed systems like peer-to-peer (P2P) networks are employed to speed up PageRank. In a P2P system, each peer crawls web fragments independently. Hence the web fragment on ...
The principal ideal subgraph of the annihilating-ideal graph of commutative rings
Let $R$ be a commutative ring with identity and $\mathbb{A}(R)$ be the set of ideals of $R$ with non-zero annihilators. In this paper, we first introduce and investigate the principal ideal subgraph of the annihilating-ideal graph of $R$, denoted by $\mathbb{AG}_P(R)$. It is an (undirected) graph with vertices $\mathbb{A}_P(R)=\mathbb{A}(R)\cap \mathbb{P}(R)\setminus \{(0)\}$, where $\mathbb{P}(R)$ is...
Labeling Subgraph Embeddings and Cordiality of Graphs
Let $G$ be a graph with vertex set $V(G)$ and edge set $E(G)$. A vertex labeling $f : V(G)\rightarrow \mathbb{Z}_2$ induces an edge labeling $f^{+} : E(G)\rightarrow \mathbb{Z}_2$ defined by $f^{+}(xy) = f(x) + f(y)$, for each edge $xy\in E(G)$. For each $i \in \mathbb{Z}_2$, let $v_{f}(i)=|\{u \in V(G) : f(u) = i\}|$ and $e_{f^+}(i)=|\{xy\in E(G) : f^{+}(xy) = i\}|$. A vertex labeling $f$ of a graph $G...
GoFFish: A Framework for Distributed Analytics over Timeseries Graphs
Massive datasets from scientific instruments and enterprises were the initial Big Data frontiers. But these are being subsumed by complex, high-velocity data from ubiquitous sensors and social network streams. Such datasets are characterized by both temporal attributes and lateral relationships between them forming a graph structure, and scalable data analytics frameworks have not been adequate...
Distributed Graph Layout for Scalable Small-world Network Analysis
The in-memory graph layout or organization has a considerable impact on the time and energy efficiency of distributed memory graph computations. It affects memory locality, inter-task load balance, communication time, and overall memory utilization. Graph layout could refer to partitioning or replication of vertex and edge arrays, selective replication of data structures that hold meta-data, an...